FSM-BASED PRONUNCIATION MODELING USING ARTICULATORY PHONOLOGICAL CODE (Draft of July 1, 2010 at 04:23)

Author

  • Mark Hasegawa-Johnson

Abstract

According to articulatory phonology, the gestural score is an invariant speech representation. Though the timing schemes, i.e., the onsets and offsets, of the gestural activations may vary, the ensemble of these activations tends to remain unchanged, informing the speech content. The "gestural pattern vector" (GPV) has been proposed to encode the instantaneous gestural activations that exist across all tract variables at each time. Therefore, a gestural score with a particular timing scheme can be approximated using a GPV sequence. In this work, we propose a pronunciation modeling method that uses a finite state machine (FSM) to represent the invariance of a gestural score. Given the "canonical" gestural score (CGS) of a word with a known activation timing scheme, the plausible activation onsets and offsets are recursively generated and encoded as a weighted FSM. An empirical measure is used to prune out gestural activation timing schemes that deviate too much from the canonical gestural score. Speech recognition is achieved by matching the recovered gestural activations to the FSM-encoded gestural scores of different speech contents. In particular, the observation distribution of each GPV is modeled by an artificial neural network and Gaussian mixture tandem model. These models are used together with the FSM-based pronunciation models in a Bayesian framework. We carry out pilot word classification experiments using synthesized data from one speaker. The proposed pronunciation modeling achieves over 90% accuracy for a vocabulary of 139 words with no training observations, outperforming direct use of the canonical gestural score.
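
The idea that a gestural score with a given timing scheme reduces to a GPV sequence, and that timing variants of the same score can be collected as weighted paths, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy score `CANONICAL`, the gesture labels, the helper names `gpv_sequence` and `timing_variants`, the whole-gesture shifts, and the `exp(-deviation)` weight with a `max_deviation` threshold standing in for the empirical pruning measure. It enumerates variants by brute force rather than recursively, and it is not the construction used in the paper.

```python
import math
from itertools import product

# Hypothetical toy gestural score (not from the paper):
# (tract variable, gesture) -> (onset frame, offset frame), offset exclusive.
CANONICAL = {
    ("LIPS", "closure"): (0, 3),
    ("TONGUE_BODY", "wide"): (2, 8),
    ("TONGUE_TIP", "closure"): (7, 10),
}

def gpv_sequence(score):
    """Collapse a gestural score into its sequence of distinct GPVs.

    A GPV is modeled here as the frozenset of gestures active in a frame;
    the empty set means that no gesture is active.
    """
    start = min(onset for onset, _ in score.values())
    end = max(offset for _, offset in score.values())
    sequence = []
    for t in range(start, end):
        gpv = frozenset(g for g, (on, off) in score.items() if on <= t < off)
        if not sequence or sequence[-1] != gpv:
            sequence.append(gpv)  # keep only changes of activation pattern
    return sequence

def timing_variants(score, shifts=(-1, 0, 1), max_deviation=2):
    """Enumerate shifted timing schemes and return {GPV path: weight}.

    Assumed pruning rule: drop schemes whose total absolute shift exceeds
    max_deviation. Assumed weight: exp(-total shift), summed over all
    schemes that collapse to the same GPV path.
    """
    gestures = list(score)
    paths = {}
    for combo in product(shifts, repeat=len(gestures)):
        deviation = sum(abs(s) for s in combo)
        if deviation > max_deviation:
            continue  # prune schemes too far from the canonical timing
        shifted = {
            g: (score[g][0] + s, score[g][1] + s)
            for g, s in zip(gestures, combo)
        }
        if any(onset < 0 for onset, _ in shifted.values()):
            continue  # skip schemes that start before frame 0
        key = tuple(gpv_sequence(shifted))
        paths[key] = paths.get(key, 0.0) + math.exp(-deviation)
    return paths

if __name__ == "__main__":
    for path, weight in sorted(timing_variants(CANONICAL).items(),
                               key=lambda item: -item[1]):
        labels = [sorted(tv for tv, _ in gpv) for gpv in path]
        print(round(weight, 3), labels)
```

Each key of the returned dictionary is one GPV path and each value its accumulated weight, so the dictionary can be read as a list of weighted accepting paths that a weighted FSM would encode compactly. Every path contains the same ensemble of gestures; only the overlap pattern between neighboring gestures changes.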

Similar resources

FSM-based pronunciation modeling using articulatory phonological code

According to articulatory phonology, the gestural score is an invariant speech representation. Though the timing schemes, i.e., the onsets and offsets, of the gestural activations may vary, the ensemble of these activations tends to remain unchanged, informing the speech content. In this work, we propose a pronunciation modeling method that uses a finite state machine (FSM) to represent the inv...

Phonological Awareness Impact on Articulatory Accuracy of the Spanish Liquid [r] in Japanese FL Learners of Spanish

Foreign language learners tend to avoid phonological difficulties and simply transfer sounds whether from their L1 or any pre-existing L2. Phonological awareness (PA) gives students an active role in understanding their own potential in improving pronunciation through several methods. However, such methods are likely to be restricted to only passive learning methods, such as repetition, reading...

A Phonological Modeling System Based on Autosegmental and Articulatory Phonology

This paper describes the design and implementation of a phonological modeling system based on autosegmental and articulatory phonology and its application in speech recognition. Pronunciation modeling is an integral part of speech recognition systems. Together with language modeling, signal processing and learning models (e.g. hidden Markov model and neural network model), it influences the perf...

An articulatory analysis of phonological transfer using real-time MRI

Phonological transfer is the influence of a first language on phonological variations made when speaking a second language. With automatic pronunciation assessment applications in mind, this study intends to uncover evidence of phonological transfer in terms of articulation. Real-time MRI videos from three German speakers of English and three native English speakers are compared to uncover the ...

Publication date: 2010